Abstract

In this paper, we present an efficient deterministic algorithm for consensus in the presence of
Byzantine failures. Our algorithm achieves consensus on an $L$-bit value with communication
complexity $O(nL + n^4 L^{0.5} + n^6)$ bits, in a network consisting of $n$ processors with up to $t$
Byzantine failures, such that $t < n/3$. For large enough $L$, the communication complexity of the
proposed algorithm approaches $O(nL)$ bits. In other words, for large $L$, the communication
complexity is linear in the number of processors in the network. This is an improvement over
the work of Fitzi and Hirt (from PODC 2006), who proposed a probabilistically correct multi-valued
Byzantine consensus algorithm with a similar complexity for large $L$. In contrast to the
algorithm by Fitzi and Hirt, our algorithm is guaranteed to be always error-free. Our algorithm
requires no cryptographic technique, such as authentication, nor any secret sharing mechanism.
To the best of our knowledge, we are the first to show that, for large $L$, error-free multi-valued
Byzantine consensus on an $L$-bit value is achievable with $O(nL)$ bits of communication.



*This research is supported in part by Army Research Office grant W-911-NF-0710287 and
National Science Foundation award 1059540. Any opinions, findings, and conclusions or
recommendations expressed here are those of the authors and do not necessarily reflect the views of the
funding agencies or the U.S. government.



1 Introduction 



This paper considers the multi-valued Byzantine consensus problem. The Byzantine consensus
problem considers $n$ processors, namely $P_1, \dots, P_n$, of which at most $t$ processors may be faulty and
deviate from the algorithm in arbitrary fashion. Each processor $P_i$ is given an $L$-bit input value $v_i$,
and they want to agree on a value $v$ such that the following properties are satisfied:

• Termination: every fault-free $P_i$ eventually decides on an output value $v'_i$,

• Consistency: the output values of all fault-free processors are equal, i.e., for every fault-free
processor $P_i$, $v'_i = v'$ for some $v'$,

• Validity: if every fault-free $P_i$ holds the same input $v_i = v$ for some $v$, then $v' = v$.

Algorithms that satisfy the above properties in all executions are said to be error-free. 

We are interested in the communication complexity of error-free consensus algorithms. The
communication complexity of an algorithm is defined as the maximum (over all permissible executions)
of the total number of bits transmitted by all the processors according to the specification of the 
algorithm. This measure of complexity was first introduced by Yao and has been widely used 
by the distributed computing community [H [10] . 

System Model: We assume network and adversary models commonly used in other related work 

ESSIE]. 

We assume a synchronous fully connected network of $n$ processors, wherein the processor
identifiers are common knowledge. Every pair of processors is connected by a pair of directed
point-to-point communication channels. Whenever a processor receives a message on such a
directed channel, it can correctly assume that the message was sent by the processor at the other end
of the channel.

We assume a Byzantine adversary that has complete knowledge of the state of the other
processors, including the $L$-bit input values. No secret is hidden from the adversary. The adversary
can take over up to $t$ processors ($t < n/3$) at any point during the algorithm. These processors
are said to be faulty. The faulty processors can engage in any "misbehavior", i.e., deviations from
the algorithm, including sending incorrect messages, and collusion. The remaining processors are
fault-free and follow the algorithm.

Finally, we do not assume the availability of any cryptographic technique, such as authentication
or secret sharing.

It has been shown that error-free consensus is impossible if $t \ge n/3$ [HIE]. $\Omega(n^2)$ has been shown
to be a lower bound on the number of messages needed to achieve error-free consensus [3]. Since
any message must be of at least 1 bit, this gives a lower bound of $\Omega(n^2)$ bits on the communication
complexity of any binary (1-bit) consensus algorithm.

In practice, agreement is sometimes required for longer messages rather than just single bits.
For instance, the "value" being agreed upon may be a large file in a fault-tolerant distributed
storage system. As another example suggested in [5], in a voting protocol, the authorities must agree on
the set of all ballots to be tallied (which can be gigabytes of data). Similarly, as also suggested in
[5], multi-valued Byzantine agreement is relevant in secure multi-party computation, where many
broadcast invocations can be parallelized and thereby optimized to a single invocation with a long
message.

The problem of achieving consensus on a single $L$-bit value may be solved using $L$ instances
of a 1-bit consensus algorithm. However, this approach results in communication complexity
of $\Omega(n^2 L)$, since $\Omega(n^2)$ is a lower bound on the communication complexity of 1-bit consensus. In a
PODC 2006 paper, Fitzi and Hirt [5] presented a probabilistically correct multi-valued consensus
algorithm which improves the communication complexity to $O(nL)$ for sufficiently large $L$, at the
cost of allowing a non-zero probability of error. Since $\Omega(nL)$ is a lower bound on the communication
complexity of consensus on an $L$-bit value, this algorithm has optimal complexity for large $L$. In
their algorithm, an $L$-bit value (or message) is first reduced to a much shorter message, using
a universal hash function. Byzantine consensus is then performed on the shorter hashed values.
Given the result of consensus on the hashed values, consensus on $L$ bits is then achieved by requiring
the processors whose $L$-bit input value matches the agreed hashed value to jointly deliver the $L$ bits
to the other processors. By performing initial consensus only on the smaller hashed values, this algorithm
is able to reduce the communication complexity to $O(nL + n^3(n + k))$, where $k$ is a parameter of the
algorithm. However, since the hash function is not collision-free, this algorithm is not error-free.
Its probability of error is lower bounded by the collision probability of the hash function.

We improve on the work of Fitzi and Hirt [5], and present a deterministic error-free consensus
algorithm with communication complexity of $O(nL)$ bits for sufficiently large $L$. Our algorithm
always produces the correct result, unlike [5]. For smaller $L$, the communication complexity of our
algorithm is $O(nL + n^4 L^{0.5} + n^6)$. To our knowledge, this is the first known error-free multi-valued
Byzantine consensus algorithm that achieves, for large $L$, communication complexity linear in $n$.
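
As a quick sanity check on what "sufficiently large" can mean for this bound, one may compare each lower-order term with $nL$. This is a back-of-the-envelope derivation from the stated complexity expression only; it is not taken from the paper's own analysis, and the constant 3 below is just what the comparison yields.

```latex
\[
n^4 L^{0.5} \le nL \iff L^{0.5} \ge n^3 \iff L \ge n^6,
\qquad
n^6 \le nL \iff L \ge n^5 .
\]
\[
\text{Hence, for } L \ge n^6:\quad nL + n^4 L^{0.5} + n^6 \;\le\; 3\,nL \;=\; O(nL).
\]
```

Under this reading, the $n^4 L^{0.5}$ term is the one that dictates how large $L$ must be before the linear term dominates.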

2 Byzantine Consensus: Salient Features of the Algorithm 

The goal of our consensus algorithm is to achieve consensus on an $L$-bit value (or message). The
algorithm is designed to perform efficiently for large $L$. Consequently, our discussion will assume
that $L$ is "sufficiently large" (how large is "sufficiently large" will become clearer later in the paper).
We now briefly describe the salient features of our consensus algorithm, with the detailed algorithm
presented later in Section 3.

• Algorithm execution in multiple generations: To improve the communication complexity,
consensus on the $L$-bit value is performed "in parts". In particular, for a certain integer $D$, the
$L$-bit value is divided into $L/D$ parts, each consisting of $D$ bits. For convenience of
presentation, we will assume that $L/D$ is an integer. A sub-algorithm is used to perform consensus
on each of these $D$-bit values, and we will refer to each execution of the sub-algorithm as a
"generation".

• Memory across generations: If during any one generation, misbehavior by some faulty
processor is detected, then additional (and expensive) diagnostic steps are performed to gain
information on the potential identity of the misbehaving processor(s). This information is
captured by means of a diagnosis graph, as elaborated later. As the sub-algorithm is performed
for each new generation, the diagnosis graph is updated to incorporate any new information
that may be learnt regarding the location of the faulty processors. The execution of the
sub-algorithm in each generation is adapted to the state of the diagnosis graph at the start
of the generation.

• Bounded instances of misbehavior: With Byzantine failures, it is not always possible to
immediately determine the identity of a misbehaving processor. However, due to the manner in
which the diagnosis graph is maintained, and the manner in which the sub-algorithm adapts
to the diagnosis graph, the $t$ (or fewer) faulty processors can collectively misbehave in at
most $t(t + 1)$ generations, before all the faulty processors are exactly identified. Once a faulty
processor is identified, it is effectively isolated from the network, and cannot tamper with
future generations. Thus, $t(t + 1)$ is also an upper bound on the number of generations in
which the expensive diagnostic steps referred to above may need to be performed.

• Low-cost failure-free execution: Due to the bounded number of generations in which the
faulty processors can misbehave, it turns out that the faulty processors do not tamper with
the execution in a majority of the generations. We use a low-cost mechanism to achieve
consensus in failure-free generations, which helps to achieve low communication complexity. In
particular, we use an error detecting code-based strategy to reduce the amount of information
the processors must exchange to be able to achieve consensus in the absence of any misbehavior
(the strategy, in fact, also allows detection of potential misbehavior).

• Consistent diagnosis graph maintenance: A copy of the diagnosis graph is maintained locally
by each fault-free processor. To ensure consistent maintenance of this graph, the diagnostic
information (elaborated later) needs to be distributed consistently to all the processors in
the network. This operation itself requires a Byzantine broadcast algorithm that solves the
"Byzantine Generals Problem" [7]. With this algorithm, a "source" processor broadcasts its
message to all other processors reliably, even if some processors (including the source) may
be faulty. For this operation we use an error-free 1-bit Byzantine broadcast algorithm that
tolerates $t < n/3$ Byzantine failures with communication complexity of $O(n^2)$ bits [1].
This 1-bit broadcast algorithm is referred to as Broadcast_Single_Bit in our discussion. While
Broadcast_Single_Bit is expensive, its cumulative overhead is kept
low by invoking it a relatively small number of times, when compared to $L$.
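
The generation structure from the first and third bullets above can be sketched as follows. This is a minimal illustration, assuming the value is given as a Python bit string and that $D$ divides $L$; the function names `split_into_generations` and `max_diagnosis_generations` are hypothetical and do not come from the paper.

```python
def split_into_generations(value_bits: str, d: int) -> list:
    """Split an L-bit string into L/D parts of D bits each (one part per generation)."""
    if d <= 0 or len(value_bits) % d != 0:
        raise ValueError("this sketch assumes that D is positive and divides L")
    return [value_bits[i:i + d] for i in range(0, len(value_bits), d)]


def max_diagnosis_generations(t: int) -> int:
    """Upper bound t(t + 1) on the generations that may need the diagnostic steps."""
    return t * (t + 1)


# Example: L = 16, D = 4 gives L/D = 4 generations.
parts = split_into_generations("1011001110001111", 4)
```

With $t = 2$, for example, at most $t(t+1) = 6$ generations can involve the expensive diagnostic steps, regardless of how many generations $L/D$ there are in total.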

We now elaborate on the error detecting code used in our algorithms, and also describe the 
diagnosis graph in some more detail. 

Error detecting code: We will use Reed-Solomon codes in our algorithms (potentially other
codes may be used instead). Consider an $(m, k)$ Reed-Solomon code in Galois Field $GF(2^c)$, where
$c$ is chosen large enough (specifically, $m \le 2^c - 1$). This code encodes $k$ data symbols from $GF(2^c)$
into a codeword consisting of $m$ symbols from $GF(2^c)$. Each symbol from $GF(2^c)$ can be represented
using $c$ bits. Thus, a data vector of $k$ symbols contains $kc$ bits, and the corresponding codeword
contains $mc$ bits.

Each symbol of the codeword is computed as a linear combination of the $k$ data symbols, such
that every subset of $k$ coded symbols represents a set of linearly independent combinations of the $k$
data symbols. This property implies that any subset of $k$ symbols from the $m$ symbols of a given
codeword can be used to determine the data vector corresponding to the codeword. Similarly,
knowledge of any subset of $k$ symbols from a codeword suffices to determine the remaining symbols
of the codeword. So $k$ is also called the dimension of the code.

For a code $C$, let us denote $C()$ as the encoding function, and $C^{-1}()$ as the decoding function.
The decoding function can be applied so long as at least $k$ symbols of a codeword are available.
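
The "any $k$ symbols determine the data" property can be illustrated with a small polynomial-evaluation code. This is only a sketch under stated assumptions: it works over the prime field of integers modulo 257 rather than $GF(2^c)$ (a substitution made purely to keep the arithmetic short), and the names `encode` and `interpolate` are hypothetical.

```python
P = 257  # prime modulus; stand-in for GF(2^c), an assumption of this sketch


def encode(data, m):
    """Treat `data` as polynomial coefficients (lowest degree first) and
    evaluate at the points 1..m, giving an (m, k) codeword with k = len(data)."""
    assert m < P, "need m distinct nonzero evaluation points"
    return [sum(c * pow(x, d, P) for d, c in enumerate(data)) % P
            for x in range(1, m + 1)]


def interpolate(points):
    """Recover the k coefficients from any k (position, symbol) pairs
    by Lagrange interpolation modulo P. Positions are 1-based."""
    k = len(points)
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            # Multiply the basis polynomial by (x - xj).
            nxt = [0] * (len(basis) + 1)
            for d, c in enumerate(basis):
                nxt[d] = (nxt[d] - c * xj) % P
                nxt[d + 1] = (nxt[d + 1] + c) % P
            basis = nxt
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse (Python 3.8+)
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % P
    return coeffs


data = [5, 17, 200]                       # k = 3 data symbols
codeword = encode(data, 7)                # m = 7 coded symbols
subset = [(x, codeword[x - 1]) for x in (2, 5, 7)]
recovered = interpolate(subset)           # any 3 symbols suffice
```

Here the decoding function $C^{-1}()$ corresponds to `interpolate` applied to any $k$ available symbols; re-encoding the recovered data yields the remaining symbols of the codeword.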

Diagnosis Graph: The fault-free processors' (potentially partial) knowledge of the identity of
the faulty processors is captured by a diagnosis graph. A diagnosis graph is an undirected graph
with $n$ vertices, with vertex $i$ corresponding to processor $P_i$. A pair of processors are said to "trust"
each other if the corresponding pair of vertices in the diagnosis graph is connected by an edge;
otherwise they are said to "accuse" each other.

Before the start of the very first generation, the diagnosis graph is initialized as a fully connected 
graph, which implies that all the n processors initially trust each other. During the execution of the 
algorithm, whenever misbehavior by some faulty processor is detected, the diagnosis graph will be 
updated, and one or more edges will be removed from the graph, using the diagnostic information 
communicated using the Broadcast_Single_Bit algorithm. The use of Broadcast_Single_Bit ensures 
that the fault-free processors always have a consistent view of the diagnosis graph. As we will show 
later, the evolution of the diagnosis graph satisfies the following properties: 

• If an edge is removed from the diagnosis graph, at least one of the processors corresponding 
to the two endpoints of the removed edge must be faulty. 

• The fault-free processors always trust each other throughout the algorithm. 

• If more than t edges at a vertex in the diagnosis graph are removed, then the processor 
corresponding to that vertex must be faulty. 

The last two properties above follow directly from the first property, and the assumption that there 
are at most t faulty processors. 
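
A diagnosis graph with these properties can be represented very simply. The sketch below is illustrative only: the class name `DiagGraph` and its methods are hypothetical, processors are numbered from 0, and edge removals are assumed to be driven by consistently broadcast information as described above.

```python
class DiagGraph:
    """Undirected trust graph on n processors, initially fully connected."""

    def __init__(self, n, t):
        self.n, self.t = n, t
        self.edges = {frozenset((i, j)) for i in range(n) for j in range(i + 1, n)}

    def trusts(self, i, j):
        """P_i and P_j trust each other iff the edge (i, j) is still present."""
        return frozenset((i, j)) in self.edges

    def remove_edge(self, i, j):
        self.edges.discard(frozenset((i, j)))

    def removed_at(self, i):
        """Number of edges at vertex i removed so far."""
        remaining = sum(1 for e in self.edges if i in e)
        return (self.n - 1) - remaining

    def identified_faulty(self):
        """Vertices with more than t removed edges must be faulty (third property)."""
        return [i for i in range(self.n) if self.removed_at(i) > self.t]


g = DiagGraph(n=4, t=1)
g.remove_edge(0, 3)
g.remove_edge(1, 3)
```

In this example, vertex 3 has lost two edges while $t = 1$, so `identified_faulty()` reports processor 3; vertices 0 and 1 have each lost only one edge and are not implicated.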

3 Multi-Valued Consensus

In this section, we describe our consensus algorithm and present a proof of correctness.

The $L$-bit input value $v_i$ at each processor is divided into $L/D$ parts of size $D$ bits each, as
noted earlier. These parts are denoted as $v_i(1), v_i(2), \dots, v_i(L/D)$.

Our algorithm for achieving $L$-bit consensus consists of $L/D$ sequential executions of Algorithm
1 presented in this section (we will discuss the algorithm in detail below). Algorithm 1 is executed
once for each generation. For the $g$-th generation ($1 \le g \le L/D$), each processor $P_i$ uses $v_i(g)$ as
its input in Algorithm 1. Each generation of the algorithm results in processor $P_i$ deciding on the $g$-th
part (namely, $v'_i(g)$) of its final decision value $v'_i$.

The value $v_i(g)$ is represented by a vector of $n - 2t$ symbols, each symbol represented with
$D/(n - 2t)$ bits. For convenience of presentation, we assume that $D/(n - 2t)$ is an integer. We will
refer to these $n - 2t$ symbols as the data symbols.

An $(n, n - 2t)$ distance-$(2t + 1)$ Reed-Solomon code, denoted as $C_{2t}$, is used to encode the $n - 2t$
data symbols into $n$ coded symbols. We assume that $D/(n - 2t)$ is large enough to allow the above
Reed-Solomon code to exist, specifically, $n \le 2^{D/(n-2t)} - 1$. This condition is met only if $L$ is large
enough (since $L \ge D$).

We now present some notation to be used in our discussion below. For an $m$-element vector
$V$, we denote $V[j]$ as the $j$-th element of the vector, $1 \le j \le m$. Given a subset $A \subseteq \{1, \dots, m\}$,
denote $V/A$ as the ordered list of elements of $V$ at the locations corresponding to elements of $A$. For
instance, if $m = 5$ and $A = \{2, 4\}$, then $V/A$ is equal to $(V[2], V[4])$. We will say that $V/A \in C_{2t}$
if there exists a codeword $Z \in C_{2t}$ such that $Z/A = V/A$. Otherwise, we will say that $V/A \notin C_{2t}$.
Suppose that $Z$ is the codeword corresponding to data $v$. This is denoted as $Z = C_{2t}(v)$, and
$v = C_{2t}^{-1}(Z)$. We will extend the definition of the inverse function $C_{2t}^{-1}$ as follows. When set $A$
contains at least $n - 2t$ elements, we will define $C_{2t}^{-1}(V/A) = v$, if there exists a codeword $Z \in C_{2t}$
such that $Z/A = V/A$ and $C_{2t}(v) = Z$.
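
The restriction $V/A$ and the membership test $V/A \in C_{2t}$ can be sketched as follows. As an assumption made for brevity, the code below uses a polynomial-evaluation code over the integers modulo 257 in place of a Reed-Solomon code over $GF(2^c)$; the helper names `restrict`, `lagrange_eval` and `in_code` are hypothetical.

```python
P = 257  # prime modulus; stand-in for GF(2^c), an assumption of this sketch


def restrict(v, a):
    """V/A: the (position, symbol) pairs of V at the 1-based positions in A, in order."""
    return [(pos, v[pos - 1]) for pos in sorted(a)]


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def in_code(v, a, k):
    """True iff V/A agrees with some codeword of the dimension-k code."""
    pts = restrict(v, a)
    if len(pts) <= k:
        return True  # any k or fewer symbols extend to a codeword
    base = pts[:k]
    return all(lagrange_eval(base, x) == y for x, y in pts[k:])


# k = 2: data polynomial 3 + x evaluated at positions 1..4.
v_good = [4, 5, 6, 7]
v_bad = [4, 5, 9, 7]          # third symbol tampered with
```

For `v_bad`, the test fails on $A = \{1, 2, 3\}$ but still passes on $A = \{1, 2, 4\}$, which reflects that consistency is judged only on the positions in $A$.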

Let the set of all the fault-free processors be denoted as $P_{good}$.

Algorithm 1 for each generation $g$ consists of three stages. We summarize the function of these
three stages first, followed by a more detailed discussion:

1. Matching stage: Each processor $P_i$ encodes its $D$-bit input $v_i(g)$ for generation $g$ into $n$ coded
symbols, as noted above. Each processor $P_i$ sends one of these $n$ coded symbols to the other
processors that it trusts. Processor $P_i$ trusts processor $P_j$ if and only if the corresponding
vertices in the diagnosis graph are connected by an edge. Using the symbols thus received
from each other, the processors attempt to identify a "matching set" of processors (denoted
$P_{match}$) of size $n - t$ such that the fault-free processors in $P_{match}$ are guaranteed to have
an identical input value for the current generation. If such a $P_{match}$ is not found, it can be
determined with certainty that the fault-free processors do not all have the same input value;
in this case, the fault-free processors decide on a default output value and terminate the
algorithm.

2. Checking stage: If a set of processors $P_{match}$ is identified in the above matching stage, each
processor $P_j \notin P_{match}$ checks whether the symbols received from the processors in $P_{match}$
correspond to a valid codeword. If such a codeword exists, then the symbols received from
$P_{match}$ are said to be "consistent". If any processor finds that these symbols are not consistent,
then misbehavior by some faulty processor is detected. Otherwise, all the processors are able to
correctly compute the value to be agreed upon in the current generation.

3. Diagnosis stage: When misbehavior is detected in the checking stage, the processors in $P_{match}$
are required to broadcast the coded symbol they sent in the matching stage, using the
Broadcast_Single_Bit algorithm. Using the information received during these broadcasts, the
fault-free processors are able to learn new information regarding the potential identity of the faulty
processor(s). The diagnosis graph (called Diag_Graph in Algorithm 1) is updated to
incorporate this new information.

In the rest of this section, we discuss each of the three stages in more detail. Note that whenever
algorithm Broadcast_Single_Bit is used, all the fault-free processors will receive the broadcast
information identically. One instance of Broadcast_Single_Bit is needed for each bit of information
broadcast using Broadcast_Single_Bit.
Algorithm 1 Multi-Valued Consensus (generation $g$)

1. Matching Stage:

   Each processor $P_i$ performs the matching stage as follows:

   (a) Compute $(S_i[1], \dots, S_i[n]) = C_{2t}(v_i(g))$, and send $S_i[i]$ to every trusted processor $P_j$

   (b) $R_i[j] \leftarrow$ symbol that $P_i$ receives from $P_j$, if $P_i$ trusts $P_j$; $\perp$ otherwise

   (c) If $S_i[j] = R_i[j]$ then $M_i[j] \leftarrow$ true; else $M_i[j] \leftarrow$ false

   (d) $P_i$ broadcasts the vector $M_i$ using Broadcast_Single_Bit

   Using the received $M$ vectors:

   (e) Find a set of processors $P_{match}$ of size $n - t$ such that
       $M_j[k] = M_k[j] =$ true for every pair of $P_j, P_k \in P_{match}$

   (f) If $P_{match}$ does not exist then decide on a default value and terminate;
       else enter the Checking Stage

2. Checking Stage:

   Each processor $P_j \notin P_{match}$ performs steps 2(a) and 2(b):

   (a) If $R_j/P_{match} \in C_{2t}$ then $Detected_j \leftarrow$ false; else $Detected_j \leftarrow$ true.

   (b) Broadcast $Detected_j$ using Broadcast_Single_Bit.

   Each processor $P_i$ performs step 2(c):

   (c) Receive $Detected_j$ from each processor $P_j \notin P_{match}$ (broadcast in step 2(b)).
       If $Detected_j =$ false for all $P_j \notin P_{match}$, then decide on $v'_i(g) = C_{2t}^{-1}(R_i/P_{match})$;
       else enter the Diagnosis Stage

3. Diagnosis Stage:

   Each processor $P_j \in P_{match}$ performs step 3(a):

   (a) Broadcast $S_j[j]$ using Broadcast_Single_Bit
       (one instance of Broadcast_Single_Bit is needed for each bit of $S_j[j]$)

   Each processor $P_i$ performs the following steps:

   (b) $R^{\#}[j] \leftarrow$ symbol received from $P_j \in P_{match}$ as a result of the broadcast in step 3(a)

   (c) For all $P_j \in P_{match}$,
       if $P_i$ trusts $P_j$ and $R_i[j] = R^{\#}[j]$ then $Trust_i[j] \leftarrow$ true;
       else $Trust_i[j] \leftarrow$ false

   (d) Broadcast $Trust_i/P_{match}$ using Broadcast_Single_Bit

   (e) For each edge $(j, k)$ in Diag_Graph,
       remove edge $(j, k)$ if $Trust_j[k] =$ false or $Trust_k[j] =$ false

   (f) If $R^{\#}/P_{match} \in C_{2t}$ then
       if for any $P_j \notin P_{match}$,
       $Detected_j =$ true, but no edge at vertex $j$ was removed in step 3(e)
       then remove all edges at vertex $j$ in Diag_Graph

   (g) If at least $t + 1$ edges at any vertex $j$ have been removed so far,
       then processor $P_j$ must be faulty, and all edges at $j$ are removed.

   (h) Find a set of processors $P_{decide} \subset P_{match}$ of size $n - 2t$ in the updated Diag_Graph,
       such that every pair of $P_j, P_k \in P_{decide}$ trust each other.

   (i) Decide on $v'_i(g) = C_{2t}^{-1}(R^{\#}/P_{decide})$.
3.1 Matching Stage

The line numbers referred to below correspond to the line numbers for the pseudo-code in Algorithm 1.

Line 1(a): In generation $g$, each processor $P_i$ first encodes $v_i(g)$, represented by $n - 2t$ symbols,
into a codeword $S_i$ from the code $C_{2t}$. The $j$-th symbol in the codeword is denoted as $S_i[j]$. Then
processor $P_i$ sends $S_i[i]$, the $i$-th symbol of its codeword, to all the other processors that it trusts.
Recall that $P_i$ trusts $P_j$ if and only if there is an edge between the corresponding vertices in the
diagnosis graph (referred to as Diag_Graph in the pseudo-code).

Line 1(b): Let us denote by $R_i[j]$ the symbol that $P_i$ receives from a trusted processor $P_j$.
Processor $P_i$ ignores any messages received from untrusted processors, treating the message as a
distinguished symbol $\perp$.

Line 1(c): Flag $M_i[j]$ is used to record whether processor $P_i$ finds processor $P_j$'s symbol consistent
with its own local value. Specifically, the pseudo-code in line 1(c) is equivalent to the following:

• When $P_i$ trusts $P_j$: If $R_i[j] = S_i[j]$, then set $M_i[j] =$ true; else $M_i[j] =$ false.

• When $P_i$ does not trust $P_j$: $M_i[j] =$ false.

Line 1(d): As we will see later, if a fault-free processor $P_i$ does not trust another processor, then
the other processor must be faulty. Thus entry $M_i[j]$ in vector $M_i$ is false if $P_i$ believes that
processor $P_j$ is faulty, or that the value at processor $P_j$ differs from the value at $P_i$. Thus, entry
$M_i[j]$ being true implies that, as of this time, $P_i$ believes that $P_j$ is fault-free, and that the value
at $P_j$ is possibly identical to the value at $P_i$. Processor $P_i$ uses Broadcast_Single_Bit to broadcast
$M_i$ to all the processors. One instance of Broadcast_Single_Bit is needed for each bit of $M_i$.
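
Steps 1(a) through 1(c) can be simulated in a few lines for the case in which every processor follows the algorithm. This is a sketch under stated assumptions, not the paper's implementation: it uses a polynomial-evaluation code over the integers modulo 257 in place of $C_{2t}$ over $GF(2^c)$, numbers processors from 0, and the names `encode`, `matching_vectors` and `trusts` are hypothetical.

```python
P = 257  # prime modulus; stand-in for GF(2^c), an assumption of this sketch


def encode(data, n):
    """Codeword symbol at index x-1 is the data polynomial evaluated at x = 1..n."""
    return [sum(c * pow(x, d, P) for d, c in enumerate(data)) % P
            for x in range(1, n + 1)]


def matching_vectors(inputs, n, trusts):
    """Return M with M[i][j] set as in steps 1(a)-1(c).

    inputs[i] holds the n - 2t data symbols of processor i, and
    trusts(i, j) is a diagnosis-graph lookup supplied by the caller.
    """
    S = [encode(inputs[i], n) for i in range(n)]      # step 1(a)
    M = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if trusts(i, j):
                received = S[j][j]                    # step 1(b): P_j sent its own symbol
                M[i][j] = (received == S[i][j])       # step 1(c)
            # untrusted senders are ignored, so M[i][j] stays False
    return M


n = 4  # with t = 1, each input has n - 2t = 2 data symbols
same = matching_vectors([[3, 1]] * n, n, lambda i, j: True)
mixed = matching_vectors([[3, 1], [3, 1], [3, 1], [3, 2]], n, lambda i, j: True)
```

When all inputs agree, every entry of `same` is true. In `mixed`, processor 3 holds a different input, so the entries between it and the other processors are false.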

Lines 1(e) and 1(f): Due to the use of Broadcast_Single_Bit, all fault-free processors receive
an identical vector $M_j$ from each processor $P_j$. Using these $M$ vectors, each processor $P_i$ attempts
to find a set $P_{match}$ containing exactly $n - t$ processors such that, for every pair $P_j, P_k \in P_{match}$,
$M_j[k] = M_k[j] =$ true. Since the $M$ vectors are received identically by all the fault-free processors
(using Broadcast_Single_Bit), they can compute an identical $P_{match}$. However, if such a set $P_{match}$
does not exist, then the fault-free processors conclude that the fault-free processors do not all have
identical input; in this case, they decide on some default value, and terminate the algorithm. In
the following discussion, we will show the correctness of this step.
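
The search in Line 1(e) can be sketched as a brute-force scan over candidate subsets. The paper does not prescribe a search procedure in this section, so the exhaustive enumeration below is an assumption chosen for clarity, and it takes exponential time in the worst case. The name `find_match_set` is hypothetical, and processors are numbered from 0.

```python
from itertools import combinations


def find_match_set(M, n, t):
    """Return the first (n - t)-subset whose members pairwise report `true`
    about each other, or None if no such subset exists.

    Candidates are enumerated in a fixed lexicographic order, so processors
    holding identical M vectors compute the same set.
    """
    for cand in combinations(range(n), n - t):
        if all(M[j][k] and M[k][j] for j, k in combinations(cand, 2)):
            return set(cand)
    return None


n, t = 4, 1
# Processor 3 disagrees with everyone; processors 0, 1, 2 match each other.
M = [[(i != 3 and j != 3) for j in range(n)] for i in range(n)]
match = find_match_set(M, n, t)
```

A return value of `None` corresponds to Line 1(f): the processors decide on the default value and terminate.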

In the proofs of Lemmas 1 and 2, we assume that the fault-free processors (that is, the
processors in set $P_{good}$) always trust each other; this assumption will be shown to be correct later
in Lemma 4.

Lemma 1 If for each fault-free processor $P_i \in P_{good}$, $v_i(g) = v(g)$, for some value $v(g)$, then a set
$P_{match}$ necessarily exists (assuming that the fault-free processors trust each other).

Proof: Since all the fault-free processors have identical input $v(g)$ in generation $g$, $S_i = C_{2t}(v(g))$
for all $P_i \in P_{good}$. Since these processors are fault-free, and trust each other, they send each other
correct messages in the matching stage. Thus, $R_i[j] = S_j[j] = S_i[j]$ for all $P_i, P_j \in P_{good}$. This fact
implies that $M_i[j] =$ true for all $P_i, P_j \in P_{good}$. Since there are at least $n - t$ fault-free processors,
it follows that a set $P_{match}$ of size $n - t$ must exist. □
Observe that, although the above proof shows that there exists a set $P_{match}$ containing only
fault-free processors, there may also be other such sets that contain some faulty processors as well.
That is, the processors in $P_{match}$ cannot all be assumed to be fault-free.

The contrapositive of Lemma 1 implies that, if a set $P_{match}$ does not exist, it is certain that the fault-free
processors do not all have the same input value. In this case, they can correctly agree on some default
value and terminate the algorithm. This proves the correctness of Line 1(f).

In the case when a set $P_{match}$ is found, the following lemma is useful.

Lemma 2 The fault-free processors in $P_{match}$ (that is, all the processors in $P_{match} \cap P_{good}$) have
the same input for generation $g$.

Proof: $|P_{match} \cap P_{good}| \ge n - 2t$ because $|P_{match}| = n - t$ and there are at most $t$ faulty processors.
Consider any two processors $P_i, P_j \in P_{match} \cap P_{good}$. Since $M_i[j] = M_j[i] =$ true, it follows that
$S_i[i] = S_j[i]$ and $S_j[j] = S_i[j]$. Since there are at least $n - 2t$ fault-free processors in $P_{match} \cap P_{good}$, this
implies that the codewords computed by these fault-free processors (in Line 1(a)) contain at least
$n - 2t$ identical symbols. Since the code $C_{2t}$ has dimension $(n - 2t)$, this implies that the fault-free
processors in $P_{match} \cap P_{good}$ must have identical input in generation $g$. □

3.2 Checking Stage 

When $P_{match}$ is found during the matching stage, the checking stage is entered.

Lines 2(a) and 2(b): Every fault-free processor $P_j \notin P_{match}$ checks whether the symbols
received from the trusted processors in $P_{match}$ are consistent with a valid codeword: that is, it checks
whether $R_j/P_{match} \in C_{2t}$. The result of this test is broadcast as a 1-bit notification $Detected_j$,
using Broadcast_Single_Bit. If $R_j/P_{match} \notin C_{2t}$, then processor $P_j$ is said to have detected an
inconsistency.

Line 2(c): If no processor announces in Line 2(b) that it has detected an inconsistency, each
fault-free processor $P_i$ chooses $C_{2t}^{-1}(R_i/P_{match})$ as its decision value for generation $g$.

The following lemma argues correctness of the decision made in Line 2(c).

Lemma 3 If no processor detects inconsistency in Line 2(a), all fault-free processors $P_i \in P_{good}$
decide on the identical output value $v'(g)$ such that $v'(g) = v_j(g)$ for all $P_j \in P_{match} \cap P_{good}$.

Proof: Observe that the size of set $P_{match} \cap P_{good}$ is at least $n - 2t$, and hence the inverse operations
$C_{2t}^{-1}(R_i/P_{match})$ and $C_{2t}^{-1}(R_i/(P_{match} \cap P_{good}))$ are both defined.

Since fault-free processors send correct messages, $R_i/(P_{match} \cap P_{good})$ are identical for all
fault-free processors $P_i \in P_{good}$. Since no inconsistency has been detected by any processor, every
fault-free processor $P_i$ decides on $C_{2t}^{-1}(R_i/P_{match})$ as its output. Since $C_{2t}$ has dimension $(n - 2t)$,
$C_{2t}^{-1}(R_i/P_{match}) = C_{2t}^{-1}(R_i/(P_{match} \cap P_{good}))$. It then follows that all the fault-free processors $P_i$
decide on the identical value $v'(g) = C_{2t}^{-1}(R_i/(P_{match} \cap P_{good}))$ in Line 2(c). Since $R_j/(P_{match} \cap P_{good}) =
S_j/(P_{match} \cap P_{good})$ for all processors $P_j \in P_{match} \cap P_{good}$, $v'(g) = v_j(g)$ for all $P_j \in P_{match} \cap P_{good}$. □
3.3 Diagnosis Stage

When any processor that is not in $P_{match}$ announces that it has detected an inconsistency, the
diagnosis stage is entered. The algorithm allows for the possibility that a faulty processor may
erroneously announce that it has detected an inconsistency. The purpose of the diagnosis stage is
to learn new information regarding the potential identity of a faulty processor. The new information
is used to remove one or more edges from the diagnosis graph Diag_Graph. As we will soon show,
when an edge $(j, k)$ is removed from the diagnosis graph, at least one of $P_j$ and $P_k$ must be faulty.
We now describe the steps in the Diagnosis Stage.

Lines 3(a) and 3(b): Every fault-free processor $P_j \in P_{match}$ uses Broadcast_Single_Bit to
broadcast $S_j[j]$ to all processors. Let us denote by $R^{\#}[j]$ the result of the broadcast from $P_j$. Due to
the use of Broadcast_Single_Bit, all fault-free processors receive identical $R^{\#}[j]$ for each processor
$P_j \in P_{match}$. This information will be used for diagnostic purposes.

Lines 3(c) and 3(d): Every fault-free processor $P_i$ uses flag $Trust_i[j]$ to record whether it
"believes", as of this time, that each processor $P_j \in P_{match}$ is fault-free or not. Specifically,

• If $P_i$ trusts $P_j$ and $R_i[j] = R^{\#}[j]$, then set $Trust_i[j] =$ true;

• If $P_i$ does not trust $P_j$ or $R_i[j] \ne R^{\#}[j]$, then set $Trust_i[j] =$ false.

Then $P_i$ broadcasts $Trust_i/P_{match}$ to all processors using Broadcast_Single_Bit.

Line 3(e): Using the Trust vectors, each fault-free processor $P_i$ then removes any edge $(j, k)$
from the diagnosis graph such that $Trust_j[k] =$ false or $Trust_k[j] =$ false. Due to the use of
Broadcast_Single_Bit, all fault-free processors receive identical Trust vectors. Hence they will remove
the same set of edges and maintain an identical view of the updated Diag_Graph.

Line 3(f): As we will soon show, in the case $R^{\#}/P_{match} \in C_{2t}$, a processor $P_j \notin P_{match}$ that
announces that it has detected an inconsistency, i.e., $Detected_j =$ true, must be faulty if no edge
attached to vertex $j$ was removed in Line 3(e). Such processors $P_j$ are "isolated", by having all edges
attached to vertex $j$ removed from Diag_Graph, and the fault-free processors will not communicate
with them anymore in subsequent generations.

Line 3(g): As we will soon show, a processor $P_j$ must be faulty if at least $t + 1$ edges at vertex $j$
have been removed. The identified faulty processor $P_j$ is then isolated.
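
The graph updates in Lines 3(e) through 3(g) can be sketched as a single function over a set of edges. Several details here are assumptions made for illustration and are not fixed by the text: the names `full_graph` and `update_diag_graph`, the representation of Trust vectors as nested dictionaries in which a missing entry means "no accusation", numbering processors from 0, and repeating step 3(g) until no further vertex qualifies.

```python
def full_graph(n):
    """Initial diagnosis graph: every pair of the n processors trusts each other."""
    return {frozenset((a, b)) for a in range(n) for b in range(a + 1, n)}


def update_diag_graph(edges, trust, detected, rsharp_in_code, n, t):
    """Apply steps 3(e)-3(g) and return the updated edge set.

    edges: set of frozenset({j, k}); trust[i][j]: broadcast Trust_i[j];
    detected[j]: broadcast Detected_j for processors outside P_match;
    rsharp_in_code: whether R#/P_match is consistent with a codeword.
    """
    edges = set(edges)

    def accused(e):
        j, k = tuple(e)
        return not trust[j].get(k, True) or not trust[k].get(j, True)

    # Step 3(e): drop every edge where either endpoint reports Trust = false.
    removed_now = {e for e in edges if accused(e)}
    edges -= removed_now

    # Step 3(f): isolate announcers whose complaint is not backed by any removed edge.
    if rsharp_in_code:
        touched = {v for e in removed_now for v in e}
        for j, flag in detected.items():
            if flag and j not in touched:
                edges = {e for e in edges if j not in e}

    # Step 3(g): isolate any vertex that has lost at least t + 1 edges so far.
    changed = True
    while changed:
        changed = False
        for j in range(n):
            degree = sum(1 for e in edges if j in e)
            if degree > 0 and (n - 1) - degree >= t + 1:
                edges = {e for e in edges if j not in e}
                changed = True
    return edges
```

For example, with $n = 4$, $t = 1$ and $P_{match} = \{P_0, P_1, P_2\}$, if only $Trust_0[2]$ is false then exactly the edge $(0, 2)$ is removed; if instead no Trust entry is false, $R^{\#}/P_{match}$ is consistent, and $P_3$ announced a detection, then all edges at vertex 3 are removed.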

Lines 3(h) and 3(i): Since Diag_Graph is updated only with information broadcast with
Broadcast_Single_Bit (Detected and Trust), all fault-free processors maintain an identical view
of the updated Diag_Graph. Then they can compute an identical set $P_{decide} \subset P_{match}$ containing
exactly $n - 2t$ processors such that every pair $P_j, P_k \in P_{decide}$ trust each other. Finally, every
fault-free processor chooses $C_{2t}^{-1}(R^{\#}/P_{decide})$ as its decision value for generation $g$.

We first prove the following property of the evolution of Diag_Graph.

Lemma 4 Every time the diagnosis stage is performed, at least one edge attached to a vertex
corresponding to a faulty processor will be removed from Diag_Graph, and only such edges will be
removed.

Proof: We prove this lemma by induction. For the convenience of discussion, let us say an edge
$(j, k)$ is "bad" if at least one of $P_j$ and $P_k$ is faulty.

Consider a generation $g$ starting with any instance of the Diag_Graph in which only bad edges
have been removed. When the diagnosis stage is performed, there are two possibilities: (1) a
fault-free processor $P_i \notin P_{match}$ detects an inconsistency; or (2) a faulty processor $P_j \notin P_{match}$ announces
that it has detected an inconsistency. We consider the two possibilities separately:

1. A fault-free processor $P_i \notin P_{match}$ detects an inconsistency: In this case, $R_i/P_{match} \notin C_{2t}$.
However, according to the definition of $P_{match}$, $R_k/P_{match} = S_k/P_{match} \in C_{2t}$ for every
processor $P_k \in P_{match} \cap P_{good}$. This implies that there must be a faulty processor $P_j \in P_{match}$,
trusted by $P_i$ and $P_k$, that has sent different symbols to the fault-free processors $P_i$ and
$P_k$ during the matching stage. Thus, $R^{\#}[j]$ must be different from at least one of $R_i[j]$
and $R_k[j]$. As a result, $Trust_i[j]$ = false or $Trust_k[j]$ = false. Then at least one of the bad
edges $(i, j)$ and $(j, k)$ will be removed in Line 3(e).

2. A faulty processor $P_j \notin P_{match}$ announces that it detects an inconsistency: Denote by
$X \subseteq P_{match}$ the set of processors in $P_{match}$ that $P_j$ trusts. According to the algorithm, either a
bad edge $(j, k)$ for some $P_k \in X$ was removed in Line 3(e), or no such edge was removed.
In the former case, the bad edge $(j, k)$ is removed. In the latter case, there are two possibilities:

(a) $R^{\#}/P_{match} \in C_{2t}$. Given that no edge $(j, k)$ with $P_k \in X$ was removed in Line 3(e),
one can conclude that, if $P_j$ is fault-free, then $Trust_j[k]$ = true for all $P_k \in X$, and
$R_j/X = R^{\#}/X \in C_{2t}$. On the other hand, observe that $P_j$ computes $Detected_j$ by
checking whether $R_j/X \in C_{2t}$, since any message from untrusted processors in $P_{match}$
should have been ignored by $P_j$ in Line 1(b). From $Detected_j$ = true, one can conclude
that, if $P_j$ is fault-free, $R_j/X \notin C_{2t}$. Now we have a contradiction if $P_j$ is fault-free.
So processor $P_j$ must be faulty and all edges at vertex $j$ are bad. These bad edges are
removed in Line 3(f).

(b) $R^{\#}/P_{match} \notin C_{2t}$. In this case, similar to the discussion in case 1, some bad edge
connecting two vertices corresponding to processors in $P_{match}$ is removed in Line 3(e).

So by the end of Line 3(f), at least one new bad edge has been removed. Moreover, since
$R_i[k] = R^{\#}[k]$ for all fault-free processors $P_k \in P_{match} \cap P_{good}$, $Trust_i[k]$ remains true for every
pair of processors $P_i, P_k \in P_{good}$, which implies that the vertices corresponding to the fault-free
processors will remain fully connected, and each will always have at least $n - t - 1$ edges. It
follows that a processor $P_j$ must be faulty if at least $t + 1$ edges at vertex $j$ have been removed. So
all edges at $j$ are bad and will be removed in Line 3(g).

Now we have proved that for every generation that begins with a Diag_Graph in which only bad
edges have been removed, at least one new bad edge, and only bad edges, will be removed in the
updated Diag_Graph by the end of the diagnosis stage. Together with the fact that Diag_Graph
is initialized as a complete graph, we finish the proof. □
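
The graph update analysed in this proof can be sketched in Python as follows. This is a minimal sketch of Lines 3(e) to 3(g) as they are paraphrased in this section, not the paper's pseudocode: the function name `update_diag_graph`, the dictionary layout of the broadcast Trust vectors, and the default of "trusted" for missing entries are assumptions made for illustration.

```python
def update_diag_graph(edges, n, t, trust, detected):
    """Apply one diagnosis-stage update to the diagnosis graph.

    edges:    set of frozenset({j, k}) currently present (hypothetical layout).
    trust:    dict i -> dict k -> bool, the broadcast Trust_i entries for
              processors k in P_match; missing entries are read as True.
    detected: set of processors outside P_match that announced an inconsistency.
    Returns the updated edge set; the input set is not modified.
    """
    edges = set(edges)

    # Line 3(e): drop edge (j, k) if either endpoint reports distrust.
    removed_in_e = set()
    for edge in list(edges):
        j, k = tuple(edge)
        if not trust.get(j, {}).get(k, True) or not trust.get(k, {}).get(j, True):
            edges.discard(edge)
            removed_in_e.add(edge)

    # Line 3(f): a processor that claimed an inconsistency but lost no edge
    # in 3(e) is identified as faulty, so all of its edges are removed.
    for j in detected:
        if not any(j in edge for edge in removed_in_e):
            edges = {edge for edge in edges if j not in edge}

    # Line 3(g): a vertex that has lost at least t + 1 of its n - 1 edges
    # must be faulty and is isolated.
    for j in range(n):
        degree = sum(1 for edge in edges if j in edge)
        if (n - 1) - degree >= t + 1:
            edges = {edge for edge in edges if j not in edge}

    return edges
```

With n = 4 and t = 1, if processors 0 and 2 both report distrust of processor 1, vertex 1 loses two edges in 3(e), reaches the t + 1 threshold, and is isolated in 3(g), leaving only the edges among processors 0, 2 and 3.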

The above proof of Lemma 4 shows that all fault-free processors will trust each other throughout
the execution of the algorithm, which justifies the assumption made in the proofs of the previous
lemmas. The following lemma shows the correctness of Lines 3(h) and 3(i).

Lemma 5 By the end of the diagnosis stage, all fault-free processors $P_i \in P_{good}$ decide on the same
output value $v'(g)$, such that $v'(g) = v_j(g)$ for all $P_j \in P_{match} \cap P_{good}$.

Proof: First of all, the set $P_{decide}$ necessarily exists since there are at least $n - 2t \ge t + 1$
fault-free processors in $P_{match} \cap P_{good}$ that always trust each other. Secondly, since the size of $P_{decide}$ is
$n - 2t \ge t + 1$, it must contain at least one fault-free processor $P_k \in P_{decide} \cap P_{good}$. Since $P_k$ still trusts
all processors of $P_{decide}$ in the updated Diag_Graph, $R^{\#}/P_{decide} = R_k/P_{decide} = S_k/P_{decide}$. The
second equality is due to the fact that $P_k \in P_{match}$. Finally, since the size of set $P_{decide}$ is $n - 2t$, the
inverse operation $C_{2t}^{-1}(R^{\#}/P_{decide})$ is defined, and it equals $C_{2t}^{-1}(S_k/P_{decide}) = v_k(g) = v_j(g)$
for all $P_j \in P_{match} \cap P_{good}$, as per Lemma 2. □
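
The last step of this proof relies on the code being decodable from any n - 2t of its n symbols. The Python sketch below illustrates that property with a Reed-Solomon style construction over a small prime field. This particular construction, the modulus `P = 257`, and the names `lagrange_eval`, `encode` and `decode` are assumptions for illustration only; the section itself only requires that the inverse operation be defined on n - 2t symbols. The three-argument `pow` with exponent -1 needs Python 3.8 or later.

```python
P = 257  # assumed prime modulus; symbols are integers in [0, P)


def lagrange_eval(points, x):
    """Evaluate at x (mod P) the unique polynomial of degree < len(points)
    passing through the given (position, value) pairs."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Treat the k data symbols as polynomial values at positions 1..k and
    return the n codeword symbols at positions 1..n (requires n < P)."""
    points = list(enumerate(data, start=1))
    return [lagrange_eval(points, x) for x in range(1, n + 1)]


def decode(symbols):
    """Recover the k data symbols from any k codeword symbols, given as a
    dict mapping 1-based position to symbol value."""
    points = sorted(symbols.items())
    return [lagrange_eval(points, x) for x in range(1, len(points) + 1)]
```

With n = 7 and t = 2, each codeword carries k = n - 2t = 3 data symbols, and any three positions (for instance 4, 6 and 7) suffice to recover them. Two distinct codewords differ in at least 2t + 1 = 5 positions, because two distinct polynomials of degree below 3 agree on at most 2 points.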

We can now conclude the correctness of Algorithm 1.

Theorem 1 Given n processors, of which at most $t < n/3$ are faulty, each given an input value of
L bits, Algorithm 1 achieves consensus correctly in $L/D$ generations, with the diagnosis stage
performed at most $t(t + 1)$ times.

Proof: According to Lemmas 1 to 5, consensus is achieved correctly for each generation $g$ of $D$
bits. So the termination and consistency properties are satisfied for the L-bit outputs after $L/D$
generations. Moreover, in the case all fault-free processors are given an identical L-bit input $v$, the
D-bit output $v'(g)$ in each generation $g$ equals $v(g)$ as per Lemmas 1, 3 and 5. So the L-bit
output $v' = v$ and the validity property is also satisfied.

According to Lemma 4 and the fact that a faulty processor $P_j$ will be removed once more than $t$
edges at vertex $j$ have been removed, it takes at most $t(t + 1)$ instances of the diagnosis stage before
all faulty processors are identified. After that, the fault-free processors will not communicate with
the faulty processors. Thus, the diagnosis stage will not be performed any more. So it will be
performed at most $t(t + 1)$ times in all cases. □

3.4 Complexity 

We have discussed the operations of the proposed multi-valued consensus algorithm above. Now
let us study the communication complexity of this algorithm. Let us denote by $B$ the complexity of
broadcasting 1 bit with one instance of Broadcast_Single_Bit. In every generation, the complexity
of each stage is as follows:

• Matching stage: every processor $P_i$ sends at most $n - 1$ symbols, each of $D/(n - 2t)$ bits, to
the processors that it trusts, and broadcasts $n - 1$ bits for $M_i$. So at most $\frac{n(n-1)}{n-2t} D + n(n-1)B$
bits in total are transmitted by all n processors.

• Checking stage: every processor $P_j \notin P_{match}$ broadcasts one bit $Detected_j$ with
Broadcast_Single_Bit, and there are $t$ such processors. So $tB$ bits are transmitted.

• Diagnosis stage: every processor $P_j \in P_{match}$ broadcasts one symbol $S_j[j]$ of $D/(n - 2t)$ bits
with Broadcast_Single_Bit; and every processor $P_i$ broadcasts $n - t$ bits of $Trust_i/P_{match}$
with Broadcast_Single_Bit. So the complexity is $\frac{n-t}{n-2t} DB + n(n-t)B$ bits.

According to Theorem 1, there are $L/D$ generations in total. In the worst case, $P_{match}$ can be
found in every generation, so the matching and checking stages will be performed $L/D$ times. In
addition, the diagnosis stage will be performed at most $t(t + 1)$ times. Hence the communication
complexity of the proposed consensus algorithm, denoted as $C_{con}(L)$, is then computed as

$$C_{con}(L) = \left( \frac{n(n-1)}{n-2t} D + n(n-1)B + tB \right) \frac{L}{D} + t(t+1) \left( \frac{n-t}{n-2t} D + n(n-t) \right) B \tag{1}$$
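
The per-stage costs and their combination into the total can be written out as a short Python sketch. The function names `stage_costs` and `c_con` are hypothetical, and the sketch assumes D divides L so that the number of generations L/D is exact; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def stage_costs(n, t, D, B):
    """Worst-case bits per stage, following the three bullets above.

    B is the cost in bits of broadcasting one bit with the 1-bit
    broadcast primitive; it is an input here, not computed."""
    matching = Fraction(n * (n - 1), n - 2 * t) * D + n * (n - 1) * B
    checking = t * B
    diagnosis = Fraction(n - t, n - 2 * t) * D * B + n * (n - t) * B
    return matching, checking, diagnosis


def c_con(n, t, L, D, B):
    """Total worst-case bits: matching and checking run in each of the
    L/D generations, diagnosis runs at most t(t+1) times."""
    matching, checking, diagnosis = stage_costs(n, t, D, B)
    return (matching + checking) * Fraction(L, D) + t * (t + 1) * diagnosis
```

For instance, with n = 4, t = 1, D = 2, B = 16 and L = 8, the stages cost 204, 16 and 240 bits respectively, and the total is (204 + 16) * 4 + 2 * 240 = 1360 bits.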

For a large enough value of $L$, with a suitable choice of $D = \sqrt{\frac{(n^2-n+t)(n-2t)L}{t(t+1)(n-t)}}$, we have

$$C_{con}(L) = \frac{n(n-1)}{n-2t} L + 2BL^{0.5} \sqrt{\frac{(n^2-n+t)\,t(t+1)(n-t)}{n-2t}} + t(t+1)n(n-t)B \tag{2}$$
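
The choice of D can be checked by expanding the total cost and balancing the two terms that depend on D. The derivation below is a sketch added for the reader; the shorthand symbols a and b are introduced here and do not appear in the original, and it assumes t ≥ 1 so that b > 0.

```latex
\begin{align*}
C_{con}(L) &= \frac{n(n-1)}{n-2t}\,L + \frac{a}{D} + b\,D + t(t+1)\,n(n-t)\,B, \\
a &= (n^2 - n + t)\,B\,L, \qquad b = \frac{t(t+1)(n-t)}{n-2t}\,B, \\
\frac{a}{D} + b\,D &\ge 2\sqrt{ab}, \quad \text{with equality iff } D = \sqrt{a/b}
  = \sqrt{\frac{(n^2-n+t)(n-2t)\,L}{t(t+1)(n-t)}}, \\
2\sqrt{ab} &= 2B\,L^{0.5}\sqrt{\frac{(n^2-n+t)\,t(t+1)(n-t)}{n-2t}}.
\end{align*}
```

The inequality is the arithmetic-geometric mean bound, so this D minimizes the D-dependent part of the cost, and substituting it yields the middle term of the optimized expression.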

Error-free algorithms that broadcast 1 bit with communication complexity $O(n^2)$ bits are known
[1, 2]. So we assume $B = \Theta(n^2)$. Then the complexity of our algorithm for $t < n/3$ becomes

$$C_{con}(L) = \frac{n(n-1)}{n-2t} L + O(n^4 L^{0.5} + n^6) = O(nL + n^4 L^{0.5} + n^6). \tag{3}$$

So for sufficiently large $L$ ($\Omega(n^6)$), the communication complexity approaches $O(nL)$.
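
To see how quickly the linear term takes over, the following Python sketch evaluates the three terms of the optimized cost for growing L. It is a numeric illustration only: the function names are hypothetical, B is set to exactly n^2 as a stand-in for the Θ(n^2) assumption, and D is treated as a real number rather than an integer dividing L.

```python
import math


def optimal_D(n, t, L):
    """Generation size D that balances the two D-dependent cost terms."""
    return math.sqrt((n * n - n + t) * (n - 2 * t) * L
                     / (t * (t + 1) * (n - t)))


def c_con_terms(n, t, L, B):
    """Return the (linear, sublinear, constant) terms of the optimized cost."""
    linear = n * (n - 1) / (n - 2 * t) * L
    sublinear = 2 * B * math.sqrt(
        (n * n - n + t) * t * (t + 1) * (n - t) * L / (n - 2 * t))
    constant = t * (t + 1) * n * (n - t) * B
    return linear, sublinear, constant


def linear_share(n, t, L):
    """Fraction of the total cost contributed by the term linear in L,
    under the illustrative assumption B = n**2."""
    linear, sublinear, constant = c_con_terms(n, t, L, n * n)
    return linear / (linear + sublinear + constant)
```

For n = 7 and t = 2, the linear term accounts for roughly 87% of the total at L = 10^6 bits and for more than 99.9% at L = 10^12 bits, which matches the claim that the cost approaches the linear term as L grows.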

4 Multi-Valued Broadcast and Tolerating $t \ge n/3$ Failures

Here we briefly discuss the Byzantine broadcast problem (also known as the "Byzantine Generals
Problem" [7]). Similar to the consensus problem, the broadcast problem also considers achieving
agreement among n processors: a designated "source" processor tries to broadcast an L-bit value
to the other processors, while $t < n/3$ processors (possibly including the source) may be faulty.
Using techniques introduced in this paper, we can achieve error-free multi-valued broadcast with
communication complexity $C_{bro}(L) < 1.5(n-1)L + O(n^4 L^{0.5})$ bits for $t < n/3$ and large $L$ [8]. Notice
that the complexity of any broadcast algorithm, even the ones that allow a positive probability of
error, is lower bounded by $(n-1)L$. So we can achieve error-free broadcast with complexity within
a factor of $1.5 + \epsilon$ of the optimal for any constant $\epsilon > 0$ and sufficiently large $L$.

Most of our discussion in the previous section is independent of the number of faulty processors.
The requirement $t < n/3$ is needed only for the correctness of the deterministic error-free 1-bit
broadcast algorithm Broadcast_Single_Bit. In practice, it may be desirable to be able to tolerate
$t \ge n/3$ failures at the cost of a non-zero probability of error. This need can be met by our algorithm
with a small modification: substitute Broadcast_Single_Bit with any probabilistically correct 1-bit
broadcast algorithm that tolerates the desired number of failures (ones with authentication from
[10], for example). With this modification, our algorithm tolerates the same number of failures as
the 1-bit broadcast algorithm does, and makes an error only if the 1-bit broadcast algorithm fails.
The only difference in the communication complexity is the term sub-linear in L. So for sufficiently
large L, the complexity of the modified algorithm is also $O(nL)$.

5 Conclusion 

In this paper, we present an efficient error-free Byzantine consensus algorithm for long messages. The
algorithm requires $O(nL)$ total bits of communication for messages of L bits for sufficiently large L.
Our algorithm makes no cryptographic assumption and is still able to always solve the Byzantine
consensus problem correctly.